Support CPython 3.11, 3.12, and aarch64 processors #2331

Open
wants to merge 69 commits into
base: master
from

Conversation

ddelange

@ddelange ddelange commented Jan 20, 2023

Hoi 👋

linux-aarch64 makes up almost 10% of all platforms (ref giampaolo/psutil#2103)

aarch64 has already surpassed Windows in terms of downloads for this package. Oracle, Amazon, Google, and Microsoft all offer aarch64 cloud instances at a very competitive price point compared to AMD/Intel, so demand will most likely only grow

  • this PR is adapted from Add arm64 mac and linux wheels MagicStack/asyncpg#954
  • uses QEMU emulation for linux arm64 wheels: manylinux takes around 2.5hrs per wheel and alpine arm64 up to 4 hrs 😅
  • manylinux2014 wheels are built with GCC 10, which I think does not guarantee proper functioning of pybind11 (docs).
    • so with this PR, linux wheels are built with GCC 12 (manylinux_2_28).
    • pip will only install these wheels on linux operating systems with glibc >= 2.28 (nearly all 2020+ linux distributions, such as debian 10 buster, ubuntu 20.04 focal, almalinux/rhel 8, ...).
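A rough way to check whether a given machine meets the glibc >= 2.28 requirement is sketched below. It relies on `platform.libc_ver()` from the standard library, which is only a heuristic and is not how pip itself decides compatibility; the helper names are made up for this illustration.

```python
import platform
import re


def parse_version(text):
    """Turn a version string such as '2.31' into a tuple of ints: (2, 31)."""
    return tuple(int(number) for number in re.findall(r"\d+", text)[:2])


def supports_manylinux_2_28(libc_name, libc_version):
    """Return True if the C library is glibc at version 2.28 or newer."""
    if libc_name != "glibc" or not libc_version:
        return False
    return parse_version(libc_version) >= (2, 28)


if __name__ == "__main__":
    # platform.libc_ver() returns a (name, version) pair; both may be empty
    # strings when the C library cannot be identified.
    name, version = platform.libc_ver()
    print(name, version, supports_manylinux_2_28(name, version))
```

On systems where the check returns False, pip would be expected to skip the manylinux_2_28 wheels and fall back to another wheel or a source build.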

the wheels from this PR can be installed with:

# space separated list for --find-links
export PIP_FIND_LINKS=https://github.com/ddelange/vaex/releases/expanded_assets/core-v4.17.1.post4
pip install --force-reinstall vaex

fixes #2366, fixes #2368, fixes #2397

@maartenbreddels
Member

Hoi 👋

exciting, will take a look early next week!

  • manylinux takes around 2.5hrs per wheel and alpine arm64 up to 4 hrs

that worries me a bit.. :)

groeten,

Maarten

@ddelange
Author

ddelange commented Jan 21, 2023

here are all timings: https://github.com/ddelange/vaex/actions/runs/3965720337/usage

depending on how often a month you release vaex, this could eat into the 2k free minutes of GH...

as the parallelization is maximised and the wheels are pushed to PyPI as soon as they're built, most of them will be available soon after release regardless

here are all the wheels: distributions.zip
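A back-of-the-envelope estimate of the emulated build time per release, using only the per-wheel figures quoted in this thread (about 2.5 hrs per manylinux aarch64 wheel, up to 4 hrs per alpine aarch64 wheel). The count of six CPython versions is an assumption for illustration, and the function name is hypothetical.

```python
# Per-wheel QEMU build times quoted in this thread, in minutes.
MANYLINUX_AARCH64_MINUTES = 150  # around 2.5 hrs
MUSLLINUX_AARCH64_MINUTES = 240  # up to 4 hrs


def emulated_minutes_per_release(python_versions):
    """Upper-bound estimate of emulated aarch64 build minutes for one release."""
    per_version = MANYLINUX_AARCH64_MINUTES + MUSLLINUX_AARCH64_MINUTES
    return python_versions * per_version


if __name__ == "__main__":
    # Assumption: one manylinux and one musllinux aarch64 wheel for each of
    # six CPython versions.
    print(emulated_minutes_per_release(6))  # 2340
```

Under these assumptions a single release would already exceed a 2k-minute budget through the emulated builds alone, before counting any native builds.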

@ddelange
Author

interestingly, that was 8260 minutes ^

apparently that's OK? then I don't understand their explanation 🤔 https://docs.github.com/en/billing/managing-billing-for-github-actions/about-billing-for-github-actions#included-storage-and-minutes

@ddelange
Author

ddelange commented Jan 21, 2023

ah there is a fair amount of duplication in that usage table for whatever reason 🤯

@ddelange
Author

a diff of current PyPI vs the zip above:

 vaex_core-4.16.1-cp310-cp310-macosx_10_9_x86_64.whl
 vaex_core-4.16.1-cp310-cp310-macosx_11_0_arm64.whl
-vaex_core-4.16.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+vaex_core-4.16.1-cp310-cp310-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp310-cp310-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp310-cp310-musllinux_1_1_aarch64.whl
 vaex_core-4.16.1-cp310-cp310-musllinux_1_1_x86_64.whl
 vaex_core-4.16.1-cp310-cp310-win_amd64.whl
+vaex_core-4.16.1-cp311-cp311-macosx_10_9_x86_64.whl
+vaex_core-4.16.1-cp311-cp311-macosx_11_0_arm64.whl
+vaex_core-4.16.1-cp311-cp311-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp311-cp311-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp311-cp311-musllinux_1_1_aarch64.whl
+vaex_core-4.16.1-cp311-cp311-musllinux_1_1_x86_64.whl
+vaex_core-4.16.1-cp311-cp311-win_amd64.whl
 vaex_core-4.16.1-cp36-cp36m-macosx_10_9_x86_64.whl
-vaex_core-4.16.1-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+vaex_core-4.16.1-cp36-cp36m-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp36-cp36m-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp36-cp36m-musllinux_1_1_aarch64.whl
 vaex_core-4.16.1-cp36-cp36m-musllinux_1_1_x86_64.whl
 vaex_core-4.16.1-cp36-cp36m-win_amd64.whl
 vaex_core-4.16.1-cp37-cp37m-macosx_10_9_x86_64.whl
-vaex_core-4.16.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+vaex_core-4.16.1-cp37-cp37m-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp37-cp37m-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp37-cp37m-musllinux_1_1_aarch64.whl
 vaex_core-4.16.1-cp37-cp37m-musllinux_1_1_x86_64.whl
 vaex_core-4.16.1-cp37-cp37m-win_amd64.whl
 vaex_core-4.16.1-cp38-cp38-macosx_10_9_x86_64.whl
 vaex_core-4.16.1-cp38-cp38-macosx_11_0_arm64.whl
-vaex_core-4.16.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+vaex_core-4.16.1-cp38-cp38-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp38-cp38-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp38-cp38-musllinux_1_1_aarch64.whl
 vaex_core-4.16.1-cp38-cp38-musllinux_1_1_x86_64.whl
 vaex_core-4.16.1-cp38-cp38-win_amd64.whl
 vaex_core-4.16.1-cp39-cp39-macosx_10_9_x86_64.whl
 vaex_core-4.16.1-cp39-cp39-macosx_11_0_arm64.whl
-vaex_core-4.16.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+vaex_core-4.16.1-cp39-cp39-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp39-cp39-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp39-cp39-musllinux_1_1_aarch64.whl
 vaex_core-4.16.1-cp39-cp39-musllinux_1_1_x86_64.whl
 vaex_core-4.16.1-cp39-cp39-win_amd64.whl
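To read a listing like the one above programmatically, the wheel filenames can be split into their tags. This is a minimal sketch that assumes the five-field filename form (no optional build tag), which is what every file in the listing uses; the function names are hypothetical.

```python
from collections import defaultdict


def parse_wheel_filename(filename):
    """Split a wheel filename into (name, version, python, abi, platform)."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    stem = filename[: -len(".whl")]
    name, version, python_tag, abi_tag, platform_tag = stem.split("-")
    return name, version, python_tag, abi_tag, platform_tag


def platforms_by_python(filenames):
    """Group platform tags by Python tag, e.g. {'cp311': ['win_amd64']}."""
    grouped = defaultdict(list)
    for filename in filenames:
        _, _, python_tag, _, platform_tag = parse_wheel_filename(filename)
        grouped[python_tag].append(platform_tag)
    return dict(grouped)


if __name__ == "__main__":
    added = [
        "vaex_core-4.16.1-cp311-cp311-manylinux_2_28_aarch64.whl",
        "vaex_core-4.16.1-cp311-cp311-win_amd64.whl",
    ]
    print(platforms_by_python(added))
```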

Comment on lines -16 to -23
namespace std {
    template<>
    struct hash<PyObject*> {
        size_t operator()(const PyObject *const &o) const {
            return PyObject_Hash((PyObject*)o);
        }
    };
}
Author

@ddelange ddelange Jan 23, 2023


@maartenbreddels any thoughts on this (incl me updating the pybind11 submodule)?

@@ -183,12 +183,14 @@ def __str__(self):
include_package_data=True,
ext_modules=([extension_vaexfast] if on_rtd else [extension_vaexfast, extension_strings, extension_superutils, extension_superagg]) if not use_skbuild else [],
zip_safe=False,
python_requires=">=3.6",
Author


cibuildwheel parses this to determine which wheels to build
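The effect of the `python_requires` lower bound on the build matrix can be illustrated with a simplified sketch. This is not cibuildwheel's actual implementation: it only understands a single `>=X.Y` specifier, and the list of known CPython versions is an assumption made for the example.

```python
import re

# Hypothetical list of CPython minor versions a build tool might know about.
KNOWN_MINORS = [(3, 6), (3, 7), (3, 8), (3, 9), (3, 10), (3, 11), (3, 12)]


def selected_build_tags(python_requires, known=KNOWN_MINORS):
    """Return cpXY tags for the versions allowed by a '>=X.Y' specifier."""
    match = re.fullmatch(r"\s*>=\s*(\d+)\.(\d+)\s*", python_requires)
    if match is None:
        raise ValueError("this sketch only understands '>=X.Y'")
    minimum = (int(match.group(1)), int(match.group(2)))
    return [f"cp{major}{minor}" for major, minor in known if (major, minor) >= minimum]


if __name__ == "__main__":
    print(selected_build_tags(">=3.6"))
    print(selected_build_tags(">=3.8"))
```

Raising the lower bound shrinks the list, so the value in setup.py directly controls how many wheels get built.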

Author


cc @franz101

see also the diff above

@ddelange
Author

I'm guessing this is blocked by #2339

@maartenbreddels
Member

Just letting you know I'm very busy and had a vacation.
Yes, I'll try to get #2339 green first!

@ddelange
Author

fwiw there are now third party free minutes on native arm64 machines, to get rid of the slow qemu builds

@ddelange ddelange changed the title from "Build aarch64 wheels" to "Build aarch64 wheels and support python 3.11" on Jul 10, 2023
@maartenbreddels
Member

Could you try rebasing this?

@ddelange
Author

@maartenbreddels already merged in master 👍

@ddelange
Author

    ERROR: Could not find a version that satisfies the requirement vaex-core<4.17,>=4.17.0 (from vaex)
    ERROR: No matching distribution found for vaex-core<4.17,>=4.17.0
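The requirement in the error above asks for a vaex-core release that is at least 4.17.0 and at the same time below 4.17, which no release can satisfy because 4.17 and 4.17.0 compare as equal. The sketch below shows this with plain tuple comparison; it handles only simple dotted release numbers (no pre-releases or other PEP 440 features), and the helper names are hypothetical.

```python
def release_tuple(version):
    """'4.17.0' -> (4, 17, 0); only handles plain dotted release numbers."""
    return tuple(int(part) for part in version.split("."))


def pad(release, length):
    """Pad with zeros so that 4.17 and 4.17.0 compare as equal."""
    return release + (0,) * (length - len(release))


def in_range(candidate, lower_inclusive, upper_exclusive):
    """Check lower_inclusive <= candidate < upper_exclusive."""
    parts = [release_tuple(v) for v in (candidate, lower_inclusive, upper_exclusive)]
    width = max(len(p) for p in parts)
    cand, low, high = (pad(p, width) for p in parts)
    return low <= cand < high


if __name__ == "__main__":
    # None of these candidates fits ">=4.17.0,<4.17".
    for candidate in ("4.16.1", "4.17.0", "4.17.1"):
        print(candidate, in_range(candidate, "4.17.0", "4.17"))
```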

@maartenbreddels
Member

Yeah, a bug/artifact of our release script. Should be good now.

@ddelange
Author

ddelange commented Aug 3, 2023

hoi @maartenbreddels 👋

I pulled master and fixed merge conflicts, but it looks like CI is still not very happy. Seeing errors like hdf file missing on disk, and TypeError: train() got an unexpected keyword argument 'early_stopping_rounds'.

Do you think it might be related to this PR?

ddelange referenced this pull request in rapidfuzz/RapidFuzz Aug 10, 2023
@franz101
Contributor

Just wondering about the Python packaging here. Python 3.6 and 3.7 are now deprecated; on the other hand, can we bump to 3.10 and 3.11?

@to-bee

to-bee commented Aug 28, 2023

Do we have any updates on this MR?

@ddelange
Author

ddelange commented Sep 1, 2023

Hi @maartenbreddels 👋

Was your s3 account deleted by any chance?

vaex.open('s3://vaex/taxi/yellow_taxi_2009_2015_f32.hdf5?anon=true')

raises

FileNotFoundError: [Errno 2] Path does not exist 'vaex/taxi/yellow_taxi_2009_2015_f32.hdf5'. Detail: [errno 2] No such file or directory

@ddelange ddelange force-pushed the build-matrix branch 3 times, most recently from 5680eb9 to 2136629 Compare September 4, 2023 08:28
@maartenbreddels
Member

readthedocs is green again and looks good

😍

@ddelange
Author

ddelange commented Jul 4, 2024

@maartenbreddels can you cancel some of the older Actions runs here, to free a slot for the latest commit?

@maartenbreddels
Member

Maybe include

concurrency:
  group: ${{ github.ref }}
  cancel-in-progress: true

like in pythonpackage.yaml ?

@ddelange
Author

ddelange commented Jul 4, 2024

yep! added a slightly more sophisticated version

@setu4993

setu4993 commented Jul 4, 2024

Exciting to see all those ✅s!

ci/conda-env.yml Outdated
@@ -54,3 +54,5 @@ dependencies:
  - python-utils
  - progressbar2
  - zipp<3.16.0
  - pip:
    - lightgbm>=4.0.0
Author

@ddelange ddelange Jul 4, 2024


re: flaky micromamba

I could replace setup-micromamba with setup-python:

you often see packages go pip install pandas[dev]. we could move this stuff into vaex[dev] and deprecate this file? and github actions goes pip install -e .[dev]?

on a side-note: shouldn't the lightgbm>=4.0.0 constraint go into vaex-ml pyproject.toml anyway?

Member


This file was there to quickly and reliably create a conda environment. This previously was impossible to do with pip.
I'm afraid that trying this will take a lot of time. Maybe we can merge this, do a release, and try to do this separately in a new PR?

Comment on lines +96 to +100
- name: Upload release assets
  if: github.event_name == 'release'
  uses: softprops/[email protected]
  with:
    files: packages/vaex-core/dist/*
Member


We are not using the release feature of GitHub; we release on tags instead. Looking at the docs, it seems this step should have an if condition similar to the one on 'Publish a Python distribution to PyPI', right?

Author


this step will upload wheels to a github release, eg https://github.com/ddelange/python-magic/releases/tag/0.4.28.post7

it is a no-op if there is no release

@maartenbreddels
Member

One thing I'd like to see, though I'm not sure where it would go, is a simple import vaex; vaex.example() run against the built wheels, so we know the wheels are valid.
Not sure if this is easy to do, I'm just afraid that after this many changes we might see a broken release.
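A minimal sketch of such a smoke test, written as a generic helper so it can be run anywhere. The helper name is hypothetical; for the built wheels the call would be `smoke_test("vaex", "example")` in an environment where the wheel has just been installed. Tools such as cibuildwheel offer a test command hook where a check like this could live, though where it belongs in this repository's workflow is left open here.

```python
import importlib


def smoke_test(module_name, function_name):
    """Import a module and call one zero-argument function from it.

    Any exception from the import or the call propagates, which is the
    signal a CI step would use to reject a broken wheel.
    """
    module = importlib.import_module(module_name)
    return getattr(module, function_name)()


if __name__ == "__main__":
    # For the wheels in this PR the call would be smoke_test("vaex", "example").
    # A standard-library stand-in keeps this sketch runnable without vaex:
    print(smoke_test("platform", "python_version"))
```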

@ddelange I believe you've been testing with this branch, so if you are confident that is not the case, let me know and we can skip this step.

I think we should also expand the test matrix: although we build the wheels for 3.12, we have not run the test suite on it.

@ddelange
Author

ddelange commented Jul 5, 2024

were the musllinux wheels ever tested on an alpine box? musllinux smoketest is currently failing due to a missing pyarrow wheel: apache/arrow#40177

see also last three commits. i would suggest either skipping these platform smoketests, or skipping musllinux wheels. building pyarrow from source is a bit of a pain i think. any thoughts?

@maartenbreddels
Member

let's skip it (but maybe add comments on how to enable it when possible in the future?)
